home *** CD-ROM | disk | FTP | other *** search
- CONVERT.DOC 1 Revised: 07-19-96
-
- The CONVERT.EXE program converts data between several basic data formats.
- Features:
-
- * You can read data in from any of the following data formats:
- ASCII-delimited
- fixed field
- dBase-compatible
- * You can write data out in any of the following data formats:
- ASCII-delimited
- fixed field
- dBase-compatible
- WKS (Lotus 1-2-3 release 1 compatible)
- * For ASCII-delimited files, you can specify the delimiters used between
- fields as well as around numeric or character data.
- * For dBase input files, you can retain deleted records if you want.
- * You can resize variables or drop them entirely if desired.
- * You can add fields as desired and assign initial values to them.
- * You can specify up to 10 include filters; all records processed must meet
- at least one of these filter conditions.
- * You can specify up to 10 exclude filters; any records which meet these
- filter conditions are dropped.
- * For WKS output files, you can process input files bigger than Lotus
- 1-2-3 itself can handle (1-2-3 limits input records to being 240 characters
- or less).
- * Cell filters can be created, ignoring, say, any record where the value in
- cell 4 is less than 10.
- * The program can only handle files with 255 fields or fewer. (Those
- obsessed with handling more fields can ask for a version that does so.)
- * Handles DOS text files (lines end with CR/LF), Mac text files (lines end
- with CR), or Unix text files (lines end with LF).
-
-
-
- CONVERT.DOC 2 Revised: 07-19-96
-
- Data format types:
-
- An ASCII-delimited file is one which typically has double quotes around each
- character field (the quotes are optional in CONVERT) and typically commas
- between fields. Leading and trailing spaces are removed from character as well
- as all other values.
-
- A fixed-field file places each field in the same column positions on each
- record. The lengths of the fields are the same from record to record.
-
- As an example, these might be an ASCII-delimited records:
-
- "Economic Bulletin Board","202 482-3870",35
- "EBB High-Speed","202 482-2584",100
-
- Fixed-field file records might look like this for the same data:
-
- Economic Bulletin Board 202 482-3870 35
- EBB High-Speed 202 482-2584 100
-
- (On files generated on a PC, most fixed-field file records end with a CR/LF
- combination. Although these two characters actually add two characters to each
- line, most people and programming languages ignore them. Files generated
- elsewhere may not have these line terminators. Use the /BINARY option if this
- is the case.)
-
- WKS files are supported directly by Lotus 1-2-3 (all versions) as well as most
- other spreadsheet programs. DBF files are supported by dBaseIII, dBaseIII+,
- dBaseIV, and most other data base management programs (Paradox etc).
-
-
-
- CONVERT.DOC 3 Revised: 07-19-96
-
- Field-definition file:
-
- Unless you are reading a dBase file, this program requires a field-definition
- file to figure out the characteristics for each field and also to set certain
- file characteristics. If you're processing an ASCII-delimited input file, the
- routine can try to create a field-definition file for you if desired.
-
- The field-definition file can be created with any text editor. The
- field-definition file consists of several records with the following fields
- separated by spaces. Except for the record type indicator (which must begin in
- column 1), all other fields can be placed in any columns:
-
- (1) record type (see below)
- (2) length of field on input
- (3) number of decimal places for numeric data (if you don't know,
- put a "?" here; for non-numeric data, a "0" is fine) on output
- (4) length of field on output
-
- Any characters after the field length are treated as comment fields. You would
- typically use this to enter the field name or column position or any other
- information of use to you.
-
- If you'd like, you can leave out both decimal place count *and* the length of
- the output field. (You cannot leave off one and include the other however.) If
- they are left out, the number of decimal places is presumed to be "0" and the
- length of the output field is presumed to be the same as the length of the input
- field.
-
- The data record types accepted by this routine are as follows:
-
- type C = character data
- N = numeric
- L = logical (T or F)
- D = date (in yyyymmdd format)
- M = memo fields (only for dBase input files; ignored on output)
-
- The data fields should be in the order the fields are found in the source file.
-
- Note that for fixed field files, you have to account for every byte in the file.
- If you have something like this:
-
- 12345678_1_2345678_2_2345678_3 (column positions)
- APPLE X Y 12 BANANAS
-
- Even though you may think you only have five fields, the following .DEF file
- will NOT work:
-
- ; Bad .DEF file: Note does not account for blank spaces
- ; NOTE: Input and output lengths are the same (mistakenly) so left
- ; out number of decimal places and output length for each record.
- C 8 Fruit1
- C 1 Class1
- C 1 Class2
- N 5 Value
- C 11 Fruit2
-
-
- CONVERT.DOC 4 Revised: 07-19-96
-
- You may want the Fruit1 field to be in columns 1 through 8 and Class1 to be in
- column 10 but the routine will not know to skip column 9 so it will start
- reading Class1 beginning in column 9, Class2 beginning in column 10, etc. To
- drop the blank positions, you have to add dummy fields on input and ask for them
- to be dropped on output:
-
- ; Good .DEF file: Spaces between fields are accounted for
- C 8 Fruit1
- C 1 0 0 Filler
- C 1 Class1
- C 1 0 0 Filler
- C 1 Class2
- C 1 0 0 Filler
- N 5 Value
- C 1 0 0 Filler
- C 11 Fruit2
-
- You can also use the input field length and output field lengths to either drop
- fields using other formats (by specifying a zero length for the output field
- length) or for creating fields on output (by specifying a zero length for the
- input field length). You can also use this to expand on contract a field. For
- example, if Fruit1 is 8 characters long but you only want it to occupy 4
- characters on output (thus the field would be truncated), specify 8 for the
- input field length and 4 for the output field length:
-
- C 8 0 4 Fruit
-
- If the output field length is wider than the input field length, the data values
- will be shifted right or left depending on the data type. In general, numeric
- fields are shifted right (so extra spaces show up in front of the number) and
- all other field types (character, logical, or date) are shifted left.
-
- You can have the routine create the field-definition file it's using for dBase
- and ASCII-delimited files. This is controlled by the /OUTDEF=deffile and
- /-OUTDEF parameters.
-
-
- Filters:
-
- CONVERT supports two types of filters; record filters and cell (field) filters.
-
- Record filters:
-
- In general, record filters apply to the record as a whole. They are specified
- as unique types of records in the field-definition file.
-
- You may specify up to 10 include record filters and up to 10 exclude record
- filters in the field-definition file.
-
-
- CONVERT.DOC 5 Revised: 07-19-96
-
- If an include record filter is specified, the input record must contain at
- least one of the specified character strings in order to be processed. If an
- exclude record filter is specified, any input record which contains any of the
- specified character strings will be ignored. Record filters are case
- sensitive (capitalization matters) and processed as "or" items (if any filter
- is met, the condition is met; record filters are not combined to determine
- fulfillment). Record filters are specified in the field-definition file in the
- following ways:
-
- /+=string include filter, the string "string" can appear anywhere
- /S+=string include filter, the string "string" is at the beginning
- of the record
- /+S=string include filter, the string "string" is at the end of the
- record
- /-=string exclude filter, the string "string" can appear anywhere
- /S-=string exclude filter, the string "string" is at the beginning
- of the record
- /-S=string exclude filter, the string "string" is at the end of the
- record
-
- For example, if you want to specify in your control file that you only want
- records that contain either "Japan" or "France" *and* you want to exclude any
- records that begin with an underscore character, you would need to include the
- following three filter statements:
-
- /+=Japan
- /+=France
- /S-=_
-
- The character string can include hexadecimal codes (in the &Hxx format) or
- decimal codes (in the \ddd format) if necessary. See BRUCEHEX.DOC file.
-
- Cell filters:
-
- Cell filters are applied to individual cells (or fields) in the data. If any
- cell fails the test for that cell, the entire record is skipped for additional
- processing.
-
- Each variable in the field-definition file can include a cell filter. Filters
- are specified on the field type records as the last parameters on the record
- and are immediately preceded with "||" indicators:
-
- N 8 Weight || > 100
- N 4 Height || <= 6
- C 10 Name || = Banana
-
- Cell filters can be specified as any one of six relations:
-
- "=" is equal to
- "<" is less than
- ">" is greater than
- "<=" is less than or equal to
- ">=" is greater than or equal to
- "<>" is not equal to
-
-
- CONVERT.DOC 6 Revised: 07-19-96
-
- You cannot specify ranges. The item to the right of the relation is treated
- as a string if the field is non-numeric, otherwise, it's treated as a number.
- Do not include quotes around the strings unless you want that as part of the
- condition. The value can include hexadecimal codes (in the &Hxx format) or
- decimal codes (in the \ddd format) if necessary (see BRUCEHEX.DOC file). It
- can also contain spaces. The cell filter is case sensitive (capitalization
- matters).
-
- If the field is being created, you can specify an assignment cell filter
- (using "=") which will set the value of that cell as something. For example:
-
- C 10 First Name
- C 0 0 1 Middle Initial || = ?
- C 10 0 20 Last Name
-
-
- Field-definition file for SimTel archives:
-
- People who use the SimTel archives (ftp.coast.net or
- http://www.coast.net/SimTel) may find the enclosed SIMIBM.DEF file useful. It
- provides the field definitions for the standard SIMIBM.IDX file (from
- subdirectory SimTel/filedocs, download simindex.zip). The file includes some
- hints for dropping fields that don't seem to be useful as well as restricting
- the listing to just those files that have been added since a given date.
-
-
- Specifying parameters:
-
- Parameters for this program can be set in the following ways. The last setting
- encountered always wins:
- - Read from an *.INI file (see BRUCEINI.DOC file),
- - Through the use of an environmental variable (SET CONVERT=whatever), or
- - From the command line (see "Syntax" below)
-
-
-
- CONVERT.DOC 7 Revised: 07-19-96
-
- Syntax:
-
- CONVERT infile [ outfile [ deffile ] ]
- [ /INDEF=deffile | /-INDEF ] [ /OUTDEF=deffile | /OUTDEF | /-OUTDEF ]
- [ /OVERWRITE | /-OVERWRITE | /APPEND | /OVERASK ]
- [ /FROM FIXED | /FROM ASCII | /FROM DBF ] [ /DELETED | /-DELETED ]
- [ /TO FIXED | /TO ASCII | /TO WKS | /TO DBF ] [ /HEADER | /-HEADER ]
- [ /DELIMS=aroundstrings,aroundnums,betweenfields ] [ /BEEP ]
- [ /INMISS=val ] [ /INMISSC=val ] [ /OUTMISS=val ] [ /OUTMISSC=val ]
- [ /SKIP | /MISSING | /ABORT ] [ /FIRSTOBS=n ] [ /LASTOBS=n ] [ /-VER ]
- [ /-NULLS ] [ /BINARY ] [ /SCAN=n ] [ /GAP=n ] [ /-CFILTERS ]
- [ /MONO ] [ /Iinitfile | /-I ] [ /Q ] [ /? ] [ /?&H ]
-
- "infile" is the file specification for the ASCII-delimited or fixed-field file
- you want converted. You can specify a drive and path specification if
- necessary. This parameter is required.
-
- "outfile" is the file specification of the file you want to create. You can
- specify a drive and path specification if necessary. If no outfile is provided,
- the routine will presume you want the output file called the same thing as the
- infile but you want the extension to be ".FIX" (if the output file is a
- fixed-field file), ".PRN" (if the output file is ASCII-delimited), or ".WKS" (if
- the output file is to be in a WKS format).
-
- "deffile" is the file specification for the input field-definition file. You
- can specify a drive and path specification if necessary. If no deffile is
- specifically provided, the routine will presume it is called the same thing as
- the infile but it has the extension of ".DEF". Note that the deffile can only
- be provided this way if you also specify the outfile name; the use of
- /INDEF=deffile is recommended instead.
-
- "/INDEF=deffile" provides the name of the field-definition file to be read by
- the program. If no deffile is specifically provided and you're using FROM
- ASCII, the routine will presume the field-definition file exists and is named
- the same thing as the infile but it has the extension of ".DEF". A deffile has
- to be specifically provided for FROM FIXED files if one is desired.
-
- "/-INDEF" says there is no field-definition file. This is the default if you're
- using FROM DBF or FROM FIXED.
-
- "/OUTDEF=deffile" provides the name of the output file that you want the program
- to write the field-definition file to. This is useful in cases where you did
- *not* use a field-definition file on input since it allows you to see and
- possibly modify the field-definition file for next time.
-
- "/OUTDEF" says to create the field-definition file and automatically name it for
- you. The file name is the same as the infile with an extension of ".DEF".
-
- "/-OUTDEF" says to skip the creation of the field-definition file. This is the
- default for ASCII-delimited and fixed-field files.
-
-
- CONVERT.DOC 8 Revised: 07-19-96
-
- "/OVERWRITE" says to overwrite the output file if it exists already.
-
- "/-OVERWRITE" says to abort if the output file exists already.
-
- "/APPEND" says to append (add) to the output file if it exists already. This
- option is only available if you're creating either a fixed-field or ASCII-
- delimited output file.
-
- "/OVERASK" says to ask if the output file exists already. This is initially the
- default.
-
- "/FROM FIXED", "/FROM ASCII", and "/FROM DBF" specifies the format for the input
- file. The routine usually reads the input file and guesses its format for you.
- If the routine guesses incorrectly, use the /-VER parameter (below) to overrule
- it.
-
- "/DELETED" applies to dBase input files only. It says you want to retain
- records tagged as "deleted". Otherwise, they're dropped in the output file.
-
- "/-DELETED" applies to dBase input files only. It says to drop records tagged
- as "deleted". This is initially the default.
-
- "/TO FIXED", "/TO ASCII", "/TO WKS", and "/TO DBF" tells the routine what sort
- of output file you'd like to create. The input and output file formats can be
- the same if desired but you have to explicity specify an output file name in
- that case. Initially defaults to "/TO FIX".
-
- "/HEADER" is used in conjunction with WKS output files. If /HEADER is in
- effect, the first row of the spreadsheet will contain the variable name for the
- cell as provided in your control file. If none are provided, the field names
- will be FIELD_01 onward. "/-HEADER" turns this off and is the default. Note
- that the header line (if any) will show up in the output counts.
-
- "/DELIMS=aroundstrings,aroundnums,betweenfields" allows you to specify the
- delimiters (in sequence) around string fields, around numeric fields (any fields
- that isn't a character field), and between fields. Defaults to:
-
- /DELIMS=",,,
-
- (Use quotes around character strings, nothing around numeric data, and the third
- comma indicates that there is a comma between fields.) The replacement string
- can include hexadecimal codes (in the &Hxx format) or decimal codes (in the \ddd
- format) if necessary (see BRUCEHEX.DOC file) so either of the following would
- put a tab between fields:
-
- /DELIMS=",,&H09
- /DELIMS=",,\009
-
- "/BEEP" beeps when the program is finished.
-
- "/-BEEP" reverses /BEEP. Initially the default.
-
-
- CONVERT.DOC 9 Revised: 07-19-96
-
- "/INMISS=val" specifies that any numeric value that has the character string
- representation of "val" will be considered missing. Note that this is an exact
- character string comparison so /INMISS=1 will not compare to a value of "1.00".
- Defaults to /INMISS=NULL (which translates as spaces).
-
- "/INMISSC=val" specifies that any character string value that has the value of
- "val" will be considered missing. Defaults to /INMISSC=NULL (which translates
- as spaces).
-
- "/OUTMISS=val" specifies that any missing numeric input value will be translated
- on output as "val". For example, "/OUTMISS=N.A." would fill in "N.A." for each
- missing value. Defaults to /OUTMISS=NULL (which translates as spaces).
-
- "/OUTMISSC=val" specifies that any missing character input value will be
- translated on output as "val". Defaults to /OUTMISSC=NULL (which translates as
- spaces).
-
- "/SKIP" says to skip records with bad data values; otherwise the routine aborts
- when it runs into any. /SKIP, /MISSING, and /ABORT are mutually exclusive.
-
- "/MISSING" says to presume any missing fields in an ASCII-delimited record
- should be filled in with blanks (for character fields) and 0 for numeric fields.
- Incomplete records are written out (unlike in /SKIP). /SKIP, /MISSING, and
- /ABORT are mutually exclusive. Note that the program will only print out the
- first "bad" record. There may be others that show up after this one.
-
- "/ABORT" says to abort when you run into bad records. Initially the default.
- /SKIP, /MISSING, and /ABORT are mutually exclusive.
-
- "/FIRSTOBS=n" says to start reading the data beginning with record number n.
- Initially defaults to "/FIRSTOBS=1".
-
- "/LASTOBS=n" says to stop reading the data after record number n. Initially
- defaults to "/LASTOBS=2000000000" (2 billion).
-
- "/-VER" is used when the program verifies your input file and mistakenly
- determines that it is a file type other than what it is. This sometimes happens
- with fixed-field input files even when /FROM FIXED is specified.
-
- "/NULLS" allows fields that begin with decimal 0 to be left in the output file
- as valid values. Otherwise, they're treated as being missing. /NULLS is
- initially the default.
-
- "/-NULLS" translates any field which begins with the decimal 0 value as missing.
- For character fields, it's translated to the value of INMISSC, numeric fields
- are switched to the INMISS value. /NULLS is initially the default.
-
- "/BINARY" says that you have a fixed-field input file and that the physical
- records in this file do not end with the normal CR/LF combination. This is
- fairly typical of files created on a mainframe or copied from a tape. If
- /BINARY is used, every byte must be precisely accounted for. /BINARY
- automatically invokes the /FROM FIXED option.
-
- "/-BINARY" says that every record ends with a CR/LF combination. This is fairly
- standard on files created on a PC, even fixed-field files. When /-BINARY is in
- effect, trailing fields are ignored. "/-BINARY" is initially the default.
-
-
- CONVERT.DOC 10 Revised: 07-19-96
-
- "/SCAN=n" says to read the first n-records when determining the input file
- characteristics. This is only relevant for /FROM ASCII files which use the
- /-INDEF option. The field types are based on the first record read but the
- maximum field widths are determined by reading the first n-records. Initially
- defaults to "/SCAN=10".
-
- "/GAP=n" specifies to add n-characters to each output field width. This is only
- relevant for /TO FIXED fields which use the /-INDEF option. In this case,
- n-characters are added to each output field length for non-numeric fields and
- extra blank fields are added before every numeric field. This is an easy way of
- splitting up columns on display. If you want non-uniform gaps, you can always
- process the file twice; specify /-INDEF with /OUTDEF the first time, modify the
- field-definition file by hand, and then reprocess the file with /INDEF=filename.
- Initially defaults to "/GAP=0".
-
- "/CFILTERS" says that if cell filters are specified in the field-definition file
- (e.g. "N 3 Age || > 5"), these cell filters are to be respected. This is
- initially the default. (If cell filters do not appear in the field-definition
- file at all, the switch is ignored.)
-
- "/-CFILTERS" says that if cell filters are specified in the field-definition
- file, they are to be ignored. This is useful sometimes if you want the whole
- file instead of just getting a subset of it. Note that this switch will also
- turn off initializing new fields (as in "N 0 0 3 Age || = 12").
-
- "/MONO" (or "/-COLOR") does not try to override screen colors. Initially
- defaults to "/COLOR".
-
- "/COLOR" (or "/-MONO") allows screen colors to be overridden. This is initially
- the default.
-
- "/Iinitfile" says to read an initialization file with the file name "initfile".
- The file specification *must* contain a period. Initfiles are described in the
- BRUCEINI.DOC file. Initially defaults to "/ICONVERT.INI".
-
- "/-I" (or "/INULL") says to skip loading the initialization file.
-
- "/Q" turns off the line-by-line status report.
-
- "/?" or "/HELP" or "HELP" shows you the syntax for the command.
-
- "/?&H" gives you a hexadecimal and decimal conversion table.
-
-
-
- CONVERT.DOC 11 Revised: 07-19-96
-
- Return codes:
-
- CONVERT returns the following ERRORLEVEL codes:
- 0 = no problems, file converted
- 250 = operation aborted by pressing Escape
- 251 = other problems
- 252 = problems with input data and /ABORT specified
- 253 = problems with INDEF or OUTDEF file
- 254 = could not find a decent temporary output subdirectory
- 255 = syntax problems, file(s) not found, output file already exists,
- /? requested
-
-
- Restrictions and Caveats:
-
- * The program skips all Memo fields in dBase files on input.
-
- * Most spreadsheet programs restrict a given field length to being 240
- characters or less.
-
- * Date fields which are in the form "yy-mm-dd" (instead of "yyyymmdd") should
- be declared as character fields instead of date fields.
-
- * Date fields converted for WKS files are changed into strings of the
- "yy-mm-dd" format.
-
- * Some users have reported problems using this program when they're default
- drive is a network drive instead of a local hard drive. If you get a
- "path/file access error", make C: your default drive and try again.
-
-
- Author:
-
- Bruce Guthrie
- Room H-4885
- U.S. Dept of Commerce/ESA/STAT-USA
- Washington, DC 20230
-
- voice: (202) 482-3234
- e-mail: bguthrie@doc.gov
-
- You may freely copy and re-distribute this program; however, the U.S. Department
- of Commerce neither guarantees nor assures compatibility of the program with all
- computer software or hardware.
-
- See BRUCEymm.DOC file for revision history.
-
- Please provide an Internet e-mail address on all correspondence.
-
-